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Abstract 

The full-dimensional (metric, Euclidean, least squares) multidimensional scaling 
stress loss function is combined with a quadratic external penalty function term. The 
trajectory of minimizers of stress for increasing values of the penalty parameter is then 
used to find (tentative) global minima for low-dimensional multidimensional scaling. 

This is illustrated with several one-dinrensional and two-dimensional examples. 
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Note: This is a working paper which will be expanded/updated frequently. All suggestions 
for improvement are welcome. The directory deleeuwpdx.net/pubfolders/penalty has a pdf 
version, a html version, the bib hies, the complete Rmd hie with the code chunks, and the R 
source code. 


Introduction 


Full-dimensional Scaling (FDS) was introduced by De Leeuw (1993). De Leeuw, Groenen, 
and Mair (2016) discuss it in some detail. In FDS we minimize the usual Multidimensional 
Scaling (MDS) least squares loss function hrst used by Kruskal (1964a) and Kruskal (1964b). 

°(z) = \ EE “«(%-<%(z )) 2 (i) 

l<i<7<n 

over all n x n configuration matrices Z. The loss at Z is often called the stress of configuration 
Z. More generally we define pMDS as the problem of minimizing (1) over all n x p matrices. 
Thus FDS is the same as nMDS. If a configuration Z has n columns (i.e. is square) it is called 
a full configuration. 


In (1) the matrices W = {w t j} and A = {h^-} of weights and dissimilarities are non-negative, 
symmetric, and hollow. To simplify matters we suppose both W and A have positive off- 
diagonal elements. The matrix D(X) = {d^fiZ)} has the Euclidean distances between the 
rows of the configuration Z. Thus 


dij(Z) — \[(zi z j)'( z i z j)- 

We now introduce some standard MDS notation, following De Leeuw (1977). 
matrix V = {%} by 

_/-Wij if i j, 

Dj | ^—. a ■ r ■ 

[Ej=i if i=j, 

and the matrix valued function B(Z) = {bij(Z)} by 


Dehne the 


( 2 ) 


bij(Z) 


-WijCifiZ) if i 

E"=i WijeifiZ) if i=j, 


( 3 ) 


where E(Z) = {eij(Z)} is dehned as 


ei j(Z) = 


Ujj 

dij ( Z ) 

0 


if difiZ) > 0, 
otherwise. 
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( 4 ) 








Note that V and B(Z ) are both positive semi-definite and doubly-centered. Matrix V has 
rank n — 1. If all off-diagonal dij(Z) are positive then B{Z) has rank n — 1 for all Z. Note that 
De Leeuw (1984) established that near a local minimum of stress all off-diagonal distances 
are indeed positive. The only vectors in the null-space of both V and B(Z) are the vectors 
proportional to the vector with all elements equal to one. 

We assume in addition, without loss of generality, that 

\ E E ir 'A = !• 

l<i<7<n 

With these definitions we can rewrite the stress (1) as 

a(Z) = 1 - tr Z'B(Z)Z + ^tr Z'VZ, (5) 

and we can write the stationary equations as 

(V-B(Z))Z = 0, (6) 

or, in fixed point form, Z = V + B(Z)Z. 

Equation (5) shows, by the way, something which is already obvious from (1). Distances are 
invariant under translation. This is reflected in B(Z) and V being doubly-centered. As a 
consequence we usually require, again without loss of generality, that Z is column-centered. 
And that implies that Z has rank at most n — 1, which means that FDS is equivalent to 
minimizing stress over all n x (n — 1) matrices, which we can assume to be column-centered as 
well. Configurations with n — 1 columns can be called full configurations as well. In addition, 
distances are invariant under rotation, and consequently if Z solves the stationary equations 
with value cr(Z) then ZK solves the stationary equations for all rotation matrices K, and 
a{ZK) = This means there are no isolated local minima in configuration space, each 

local minimum is actually a continuum of rotated matrices in M nxn . This is a nuisance in 
the analysis of FDS and pMDS that is best dealt with by switching to the parametrization 
outlined in De Leeuw (1993). 


Convex FDS 

Instead of defining the loss function (1) on the space of all n x n configuration matrices Z we 
can also define it over the space of all positive semidefinite matrices C of order n. This gives 

a(C) = 1 - "''A; v' r " + c jj ~ 2c ij + A r VC ' A 

The Convex Full-dimensional Scaling (CFDS) problem is to minimize loss function (7) over 
all C > 0. Obviously if Z minimizes (1) then C = ZZ' minimizes (7). And, conversely, if C 
minimizes (7) then any Z such that C = ZZ' minimizes (1). 

The definition (7) shows that the CFDS loss function is a convex function on the cone of 
positive semi-definite matrices, because the square root of a non-negative linear function of 
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the elements of C is concave. Positivity of the weights and dissimilarities implies that loss 
is actually strictly convex. The necessary and sufficient conditions for C to be the unique 
solution of the CFDS problem are simply the conditions for a proper convex function to 
attain its minimum at C on a closed convex cone (Rockafellar (1970), theorem 31.4). 

V-B(C)> 0, 

C>0, 

tr C{V-B{C)) = 0. 

The conditions say that C and V — B(C ) must, be positive semi-definite and have compli¬ 
mentary null spaces. 

By the same reasoning as in the full configuration case, we also see that CFDS is equivalent 
to maximizing (7) over all doubly-centered positive semi-definite matrices. 

If C is the solution of the CFDS problem then rank(C) is called the Goiver rank of the MDS 
problem defined by W and A (De Leeuw (2016)). Although there is a unique Gower rank 
associated with each CFDS problem, we can also talk about the approximate Gower rank by 
ignoring the small eigenvalues of C. 


FDS using SMACOF 


The usual SMACOF algorithm can be applied to FDS as well. The iterations start with Z- (>> 
and use the update rule 

Z (k+1) = V + B(Z (k) )Z (k) , (8) 

where V + is the Moore-Penrose inverse of V, and is consequently also doubly-centered. This 
means that all Z^ in the SMACOF sequence, except possibly Z^°\ are column-centered and 
of rank at most n — 1. Equation (8) also shows that if Z 1 ' 0 ' 1 is of rank p < n — 1 then all Z^ 
are of rank p as well. 

De Leeuw (1977) shows global convergence of the SMACOF sequence for pMDS, generated by 
(8), to a stationary point, i.e. a point satisfying (V — B(Z))Z = 0. This result also applies, of 
course, to nMDS, i.e. FDS. If Z is a solution of the stationary equations then with C = ZZ’ 
we have both (V — B{C))C = 0 and C > 0, but since we generally do not have V — B(Z) > 0, 
this does not mean that C solves the CFDS problem. 


In fact, suppose the unique CMDS solution has Gower rank r > 2. Start the SMACOF FDS 


iterations (8) with Z of the form Z^ = 
p < r. All Z^G will be of this form and wi 


x(°) 


0 


, where X ® is an n x p matrix of rank 


1 also be of rank p, and all accumulation points Z 
of the SMACOF sequence will have this form and rank(Z) < p. Thus C = ZZ' cannot be 
the solution of the CMDS problem. 


The next result shows that things are allright, after all. Although stress in FDS is certainly 
not a convex function of Z, it remains true that all local minima are global. 


Lemma 1: [Expand] If FDS stress has a local minimum at 

and the zero block isnxg with q > 1, then 


X 


, where X is n x p 


4 



1: Va(X) = {V- B(X))X = 0. 
2: V 2 a(X) > 0. 

3: V - B(X) > 0. 


Proof: We use the fact that stress is differentiable at a local minimum (De Leeuw (1984)). 


If Z = 


X 


0 


+ e 


P 


Q then we must have cr(Z) > a(X) for all P and Q. Now 


a(Z) = cr(X) + e tr P'Va(X) + 

+ ^e 2 V 2 a(X)(P, P ) + ^e 2 tr Q'(V - B(X))Q + o(e 2 ). (9) 


The lemma follows from this expansion. ■ 

Theorem 1: [FDS Local Minima] If stationary point Z of FDS is a local minimum, then 
it also is the global minimum, and C = ZZ' solves the CFDS problem. 


Proof: We start with a special case. Suppose Z is a doubly-centered solution of the FDS 
stationary equations with rank(Z) = n — 1. Then (V — B(Z))Z = 0 implies V = B(Z), 
which implies S t] = dij(Z) for all i,j. Thus cr(Z) = 0, which obviously is the global minimum. 


Now suppose Z is a doubly-centered local minimum solution of the FDS stationary equations 
with rank(Z) = r < n — 1. Without loss of generality we assume Z is of the form 
Z = X \ 0 , with X an n x r matrix of rank r. For C = ZZ 1 to be a solution of the CFDS 
problem it is necessary and sufficient that V — B(Z) > 0. Lemma 1 shows that this is indeed 
the case at a local minimum. ■ 


Corrollary 1: [Saddle] A pMDS solution of the stationary equations with Z singular is a 
saddle point. 

Corrollary 2: [Nested] Solutions of the stationary equations of pMDS are saddle points of 
qMDS with q > p. 

The proof of lemma 1 shows that for any n x p configuration Z, not just for solutions of the 
FDS stationary equations, if V — B{Z) is indefinite we can decrease loss by adding another 
dimension. If Z is a stationary point and V — B(Z) is positive semi-definite then we actually 
have found the CFDS solution, the Gower rank, and the global minimum (De Leeuw (2014)). 


Penalizing Dimensions 

In Shepard (1962a) and Shepard (1962b) a nonmetric multidimensional scaling technique is 
developed which minimizes a loss function over configurations in full dimensionality n — 1. 
In that sense the technique is similar to FDS. Shepard’s iterative process aims to maintain 
monotonicity between distances and dissimilarities and at the same time concentrate as much 
of the variation as possible in a small number of dimensions (De Leeuw (2017)). 

Let us explore the idea of concentrating variation in p < n — 1 dimensions, but use an 
approach which is quite different from the one used by Shepard. We remain in the FDS 
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framework, but we aim for solutions in p < n — 1 dimensions by penalizing n — p dimensions 
of the full configuration, using the classical Courant quadratic penalty function. 


Partition a full configuration Z 
n x (n — p ). Then 


X 


Y 


with X of dimension n x p and Y of dimension 


a(Z) = 1 - tr X'B(Z)X - tr Y'B(Z)Y + hr X’VX + hr Y’VY. 


( 10 ) 


Also define the penalty term 

r(Y) = Mr Y’VY, 

and penalized stress 

7r(Z, A) = <y(Z) + A t(Y). 


in) 

( 12 ) 


Our proposed method is to minimize penalized stress over Z for a sequence of values 
0 = Ai < A 2 < • • • A m . For A = 0 this is simply the FDS problem, for which we know 
we can compute the global minimum. For fixed 0 < A < +00 this is a Penalized FDS or 
PFDS problem. PFDS problems with increasing values of A generate a trajectory Z( A) in 
configuration space. 


The general theory of exterior penalty functions, which we review in appendix A of this paper, 
shows that increasing A leads to an increasing sequence of stress values a and a decreasing 
sequence of penalty terms r. If A —>• +00 we approximate the global minimum of the FDS 
problem with Z of the form Z = 


X 


0 


i.e. of the pMDS problem. This assumes we do 
actually compute the global minimum for each value of A, which we hope we can do because 
we start at the FDS global minimum, and we slowly increase A. There is also a local version 
of the exterior penalty result, which implies that A —>• 00 takes us to a local minimum of 
pMDS, so there is always the possibility of taking the wrong trajectory to a local minimum 
of pMDS. 


Local Minima 

The stationary equations of the PFDS problem are solutions to the equations 

(13) 

(14) 


(V-B(Z))X = 0, 
{{1 + \)V - B{Z))Y = 0. 


We can easily related stationary points and local minima of the FDS and PFDS problem. 

Theorem 2: [PFDS Local Minima] 

1: If X is a stationary point of the pMDS problem then Z = [X | 0] is a stationary point of 
the PFDS problem, no matter what A is. 

2: If Z — [X | 0] is a local minimum of the PFDS problem then X is a local minimum of 
pMDS and (1 + A)P — B(X) > 0, or A > ||P + .B(X)|| 00 — 1, with || • the spectral radius 
(largest eigenvalue). 
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Proof: 


Part 1 follows by simple substitution in the stationary equations. 

Part 2 follows from the expansion for Z = [X + eP \ eQ]. 

7 r(Z) = tt(X) + e tr P'Va(X) + 

+ ^e 2 V 2 a(X)(P, P ) + ^e 2 tr Q'(( 1 + X)V - 5(X))Q + o(e 2 ). (15) 

At a local minimum we must have T>a(X) = 0 and T> 2 a(X)(P, P) > 0, which are the necessary 
conditions for a local minimum of pMDS. We also must have ((1 + A)B — B(X)) > 0. ■ 

Note that the conditions in part 2 of theorem 2 are also sufficient for PFDS to have a local 
minimum at [A" j 0], provided we eliminate translational and rotational indeterminacy by a 
suitable reparametrization, as in De Leeuw (1993). 


Algorithm 

The SMACOF algorithm for penalized stress is a small modification of the unpenalized FDS 
algorithm ( 8 ). We start our iterations for A j with the solution for Aj_i (the starting solution 
for Ai = 0 can be completely arbitrary). The update rules for fixed A are 


X (fc+i) = v + B(Z^)X {k \ (16) 

y(fc+i) = _l_v+ B(Z (k) )Y {k) . (17) 

Thus we compute the FDS update Z^ k+l ^ = V + 5 (ZA')) 2 '( fc ) and then divide the last n — p 
columns by 1 + A. 

Code is in the appendix. Let us analyze a number of examples. 


Examples 

This section has a number of two-dimensional and a number of one-dimensional examples. 
The one-dimensional examples are of interest, because of the documented large number of 
local minima of stress in the one-dimensional case, and the fact that for small and medium n 
exact solutions are available (for example, De Leeuw (2005)). By default we use seq( 0 , 1, 
length = 101 ) for A in most examples, but for some of them we dig a bit deeper and use 
longer sequences with smaller increments. 

If for some value of A the penalty term drops below the small cutoff 7 , for example 10 -10 , 
then there is not need to try larger values of A, because they will just repeat the same result. 
We hope that result is the global minimum of the 2MDS problem. 
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The output for each example is a table in which we give, the minimum value of stress, the 
value of the penalty term at the minimum, the value of A, and the number of iterations needed 
for convergence. Typically we print for the first three, the last, three, and some regularly 
spaced intermediate values of A. Remember that the stress values increase with increasing A, 
and the penalty values decrease. 

For two-dimensional examples we plot all two-dimensional configurations, after rotating to 
optimum match (using the function matchMe () from the appendix). We connect corresponding 
points for different values of A. Points corresponding to the highest value of A are labeled 
and have a different plot symbol. For one-dimensional examples we put 1: n on the horizontal 
axes and plot the single dimension on the vertical axis, again connecting corresponding points. 
We label the points corresponding with the highest value of A, and draw horizontal lines 
through them to more clearly show their order on the dimension. 

The appendix also has code for the function checkUniO, which we have used to check the 
solutions in the one dimensional case are indeed local minima. The function checks the 
necessary condition for a local minimum x = V + u, with 

n 

iii = YWijSij sign (.Xi - xj). 

3 =1 

It should be emphasized that all examples are just meant to study convergence of penalized 
FDS. There is no interpretation of the MDS results 


Chi Squares 


In this example, of order 10, the Sij are independent draws from a chi-square distribution 
with two degrees of freedom. There is no structure in this example, everything is random. 


## 

itel 

198 

lambda 

## 

itel 

5 

lambda 

## 

itel 

3 

lambda 

## 

itel 

1 

lambda 

## 

itel 

1 

lambda 

## 

itel 

4 

lambda 

## 

itel 

6 

lambda 

## 

itel 

20 

lambda 


000000 

stress 

0 

010000 

stress 

0 

020000 

stress 

0 

100000 

stress 

0 

200000 

stress 

0 

300000 

stress 

0 

310000 

stress 

0 

320000 

stress 

0 


175144 penalty 0.321138 
175156 penalty 0.027580 
175187 penalty 0.025895 
175914 penalty 0.015172 
177666 penalty 0.004941 
178912 penalty 0.000088 
178933 penalty 0.000020 
178939 penalty 0.000000 



o 


CM 


E 

T3 



- 0.10 0.00 0.10 


dim 1 

It seems that in this example the first two dimensions of FDS are already close to optimal 
for 2MDS. This is because the Gower rank of the dissimilarities is only three (or maybe four, 
the fourth singular value of the FDS solution Z is very small). 


Regular Simplex 

The regular simplex has all dissimilarities equal to one. We use an example with n = 10. for 
which the global minimum (as far as we know) of pMDS with p = 2 is a configuration with 
nine points equally spaced on a circle and one point in the center. 


## 

itel 

1 

lambda 

0.000000 

stress 

0.000000 

penalty 

0.400000 

## 

itel 

7 

lambda 

0.010000 

stress 

0.000102 

penalty 

0.375457 

## 

itel 

5 

lambda 

0.020000 

stress 

0.000425 

penalty 

0.360310 

## 

itel 

1 

lambda 

0.100000 

stress 

0.008861 

penalty 

0.258186 

## 

itel 

1 

lambda 

0.200000 

stress 

0.032285 

penalty 

0.149502 

## 

itel 

1 

lambda 

0.300000 

stress 

0.060136 

penalty 

0.076344 

## 

itel 

1 

lambda 

0.400000 

stress 

0.082466 

penalty 

0.035994 

## 

itel 

1 

lambda 

0.500000 

stress 

0.098777 

penalty 

0.013130 

## 

itel 

1 

lambda 

0.600000 

stress 

0.107410 

penalty 

0.002747 

## 

itel 

1 

lambda 

0.700000 

stress 

0.109651 

penalty 

0.000258 

## 

itel 

1 

lambda 

0.790000 

stress 

0.109872 

penalty 

0.000013 

## 

itel 

1 

lambda 

0.800000 

stress 

0.109876 

penalty 

0.000009 

## 

itel 

65 

lambda 

0.810000 

stress 

0.109880 

penalty 

0.000000 
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LO 



-0.15 -0.05 0.05 


dim 1 

Next, we look at the regular simplex with n = 4, for which the global minimum has four 
points equally spaced on a circle (i.e. in the corners of a square). We use seq(0, 1 , length 
= 101) for the A sequence. 


## 

itel 

1 

lambda 

0.000000 

stress 

0.000000 

penalty 

0.250000 

## 

itel 

2 

lambda 

0.010000 

stress 

0.000035 

penalty 

0.162295 

## 

itel 

1 

lambda 

0.020000 

stress 

0.000122 

penalty 

0.158591 

## 

itel 

1 

lambda 

0.100000 

stress 

0.003808 

penalty 

0.122344 

## 

itel 

1 

lambda 

0.200000 

stress 

0.014089 

penalty 

0.084111 

## 

itel 

1 

lambda 

0.300000 

stress 

0.028044 

penalty 

0.053366 

## 

itel 

1 

lambda 

0.400000 

stress 

0.043331 

penalty 

0.028965 

## 

itel 

1 

lambda 

0.500000 

stress 

0.056851 

penalty 

0.011482 

## 

itel 

1 

lambda 

0.600000 

stress 

0.064718 

penalty 

0.002471 

## 

itel 

1 

lambda 

0.700000 

stress 

0.066799 

penalty 

0.000203 

## 

itel 

1 

lambda 

0.800000 

stress 

0.066982 

penalty 

0.000005 

## 

itel 

1 

lambda 

0.820000 

stress 

0.066985 

penalty 

0.000002 

## 

itel 

1 

lambda 

0.830000 

stress 

0.066986 

penalty 

0.000001 

## 

itel 

1 

lambda 

0.840000 

stress 

0.066986 

penalty 

0.000001 
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The solution converges to an equilateral triangle with the fourth point in the centroid. This 
is a local minimum. What basically happens is that the first two dimensions of the FDS 
solution are too close to the local minimum. Or, what amounts to the same thing, the Gower 
rank is too large (it is n — 1 for a regular simplex) , there is too much variation in the higher 
dimensions, and as a consequence the first two dimensions of FDS are a bad 2MDS solution. 
We try to repair this by refining the trajectory, using seq(0, 1, 10001). 


## 

itel 

## 

itel 

## 

itel 

## 

itel 

## 

itel 

## 

itel 


1 lambda 

2 lambda 
1 lambda 
1 lambda 
1 lambda 
1 lambda 


0.000000 stress 
0.000100 stress 
0.000200 stress 
0.202200 stress 
0.202300 stress 
0.202400 stress 


0.000000 penalty 0 
0.000000 penalty 0 
0.000000 penalty 0 
0.028595 penalty 0 
0.028595 penalty 0 
0.028595 penalty 0 


250000 

166622 

166583 

000001 

000001 

000001 
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CO 

o 


o 


o 

I 


CO 

o 

I 


-0.3 -0.1 0.1 0.3 

dim 1 

Now the trajectories move us from what starts out similar to an equilateral triangle to the 
corners of the square, and thus we do find the global minimum in this way. It is remarkable 
that we manage to find the square even when we start closer to the triangle with midpoint. 

Intelligence 

These are correlations between eight intelligence tests, taken from the smacof package. We 
convert to dissimilarities by taking the negative logarithm of the correlations. As in the chi- 
square example, the FDS and the 2MDS solution are very similar and the PMDS trajectories 
are short. 


## 

itel 

2951 

lambda 

0.000000 

stress 

0.107184 

penalty 

7.988384 

## 

itel 

7 

lambda 

0.010000 

stress 

0.107560 

penalty 

0.685654 

## 

itel 

4 

lambda 

0.020000 

stress 

0.108528 

penalty 

0.628538 

## 

itel 

3 

lambda 

0.030000 

stress 

0.110045 

penalty 

0.573208 

## 

itel 

3 

lambda 

0.040000 

stress 

0.112449 

penalty 

0.510730 

## 

itel 

2 

lambda 

0.050000 

stress 

0.114714 

penalty 

0.464650 

## 

itel 

2 

lambda 

0.060000 

stress 

0.117623 

penalty 

0.415037 

## 

itel 

2 

lambda 

0.070000 

stress 

0.121095 

penalty 

0.364536 

## 

itel 

2 

lambda 

0.080000 

stress 

0.125010 

penalty 

0.315023 

## 

itel 

2 

lambda 

0.090000 

stress 

0.129226 

penalty 

0.267831 

## 

itel 

2 

lambda 

0.100000 

stress 

0.133589 

penalty 

0.223898 

## 

itel 

2 

lambda 

0.110000 

stress 

0.137944 

penalty 

0.183868 

## 

itel 

3 

lambda 

0.120000 

stress 

0.143921 

penalty 

0.133739 

## 

itel 

2 

lambda 

0.130000 

stress 

0.147473 

penalty 

0.106166 

## 

itel 

4 

lambda 

0.140000 

stress 

0.153215 

penalty 

0.064499 

## 

itel 

4 

lambda 

0.150000 

stress 

0.157159 

penalty 

0.037735 


c\j 

E 

T3 
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## itel 
## itel 


9 lambda 0.160000 stress 0.161434 penalty 0.010337 

72 lambda 0.170000 stress 0.163122 penalty 0.000000 



The singular values of the FDS solution are 1.78e+00, 1.36e+00, 5.94e-01, 1.47e-01, 3.29e- 
03, 1.70e-16, 6.62e-17, 4.67e-17, which shows that the Gower rank is probably five, but 
approximately two. 


Countries 


This is the wish dataset from the ’smacof 1 package, with similarities between 12 countries. 
They are converted to dissimilarties by subtracting each of them from seven. 


## 

itel 

1381 
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1 
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## 

itel 
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0.000000 stress 
0.010000 stress 
0.020000 stress 
0.100000 stress 
0.200000 stress 
0.300000 stress 
0.400000 stress 
0.500000 stress 
0.590000 stress 
0.600000 stress 
0.610000 stress 


4.290534 penalty 98.617909 
4.301341 penalty 36.137074 
4.336243 penalty 34.389851 
5.187917 penalty 23.300775 
7.539228 penalty 11.543635 
9.901995 penalty 4.963372 
11.523357 penalty 1.859569 
12.391692 penalty 0.556411 
12.696493 penalty 0.080144 
12.708627 penalty 0.060113 
12.738355 penalty 0.000000 
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dim 1 

The singular values of the FDS solution are 4.20e+00, 3.71e+00, 2.67e+00, 1.80e+00, 
1.33e+00, 6.64e-01, 5.97e-04, 4.88e-16, 2.49e-16, 2.10e-16, 1.52e-16, 3.04e-17, and the Gower 
rank is six or seven. 


Dutch Political Parties 

In 1967 one hundred psychology students at Leiden University judged the similarity of nine 
Dutch political parties, using the complete method of triads (De Gruijter (1967)). Data were 
aggregated and converted to dissimilarities. We first print the matrix of dissimilarities. 


## 


KVP 

PvdA 

VVD 

ARP 

CHU 

CPN 

PSP 

BP 

D66 

## 

KVP 

0. 

,000 

0. 

,209 

0, 

.196 

0. 

, 171 

0. 

, 179 

0. 

,281 

0. 

,250 

0. 

267 

0. 

,230 

## 

PvdA 

0 . 

,209 

0, 

,000 

0 , 

.250 

0 , 

.210 

0 . 

,231 

0. 

, 190 

0 . 

, 171 

0 . 

269 

0. 

,204 

## 

VVD 

0 . 

, 196 

0 . 

,250 

0 , 

.000 

0 , 

.203 

0 . 

, 185 

0. 

,302 

0. 

,281 

0 . 

257 

0. 

. 174 

## 

ARP 

0 . 

, 171 

0. 

,210 

0 , 

.203 

0 , 

.000 

0 . 

, 119 

0. 

,292 

0. 

,250 

0 . 

271 

0 . 

.228 

## 

CHU 

0 . 

. 179 

0 . 

,231 

0 , 

.185 

0 , 

.119 

0 . 

,000 

0. 

,290 

0 . 

,263 

0 . 

259 

0, 

,225 

## 

CPN 

0 . 

,281 

0. 

, 190 

0 , 

.302 

0 , 

.292 

0 . 

,290 

0. 

.000 

0. 

, 152 

0 . 

236 

0. 

,276 

## 

PSP 

0 . 

,250 

0. 

, 171 

0 , 

.281 

0 , 

.250 

0 . 

,263 

0. 

, 152 

0. 

.000 

0 . 

256 

0 , 

.237 

## 

BP 

0 . 

,267 

0. 

,269 

0 , 

.257 

0 , 

.271 

0 . 

,259 

0 . 

,236 

0. 

,256 

0 . 

000 

0. 

,274 

## 

D66 

0 . 

,230 

0. 

,204 

0 , 

. 174 

0 , 

.228 

0 . 

,225 

0 . 

,276 

0. 

,237 

0 . 

274 

0 , 

.000 


The trajectories from FDS to 2MDS show some clear movement, especially of the D’66 party, 
which was new at the time. 


## itel 223 lambda 
## itel 5 lambda 
## itel 2 lambda 
## itel 1 lambda 


0.000000 stress 0.000000 penalty 0.414526 
0.010000 stress 0.000061 penalty 0.196788 
0.020000 stress 0.000199 penalty 0.190472 
0.100000 stress 0.004399 penalty 0.136576 


14 



## 

itel 

1 

lambda 

0.200000 

## 

itel 

1 

lambda 

0.300000 

## 

itel 

1 

lambda 

0.400000 

## 

itel 

1 

lambda 

0.500000 

## 

itel 

1 

lambda 

0.520000 

## 

itel 

1 

lambda 

0.530000 

## 

itel 

277 

lambda 

0.540000 


stress 0.015811 penalty 0.075466 
stress 0.028235 penalty 0.036636 
stress 0.038275 penalty 0.012608 
stress 0.043644 penalty 0.002156 
stress 0.044091 penalty 0.001324 
stress 0.044253 penalty 0.001019 
stress 0.044603 penalty 0.000000 



There seems to be some bifurcation going on at the end, so we repeat the analysis using 
seq(0, 1 , length = 1001) for A. The results turn out to be basically the same. 


## 

itel 

223 

lambda 

## 

itel 

4 

lambda 

## 

itel 

2 

lambda 

## 

itel 

1 

lambda 

## 

itel 

1 

lambda 

## 

itel 

166 

lambda 


0.000000 stress 
0.001000 stress 
0.002000 stress 
0.468000 stress 
0.469000 stress 
0.470000 stress 


0.000000 

0.000001 

0.000002 

0.044604 

0.044604 

0.044603 


penalty 

penalty 

penalty 

penalty 

penalty 

penalty 


0.414526 

0.204225 

0.203535 

0.000001 

0.000000 

0.000000 
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-0.15 -0.05 0.05 0.15 


dim 1 


The singular values of the FDS solution are 2.95e-01, 2.10e-01, 1.89e-01, 1.34e-01, 1.16e-01, 
1.06e-01, 8.61e-02, 7.06e-02, 2.30e-18, and the Gower rank is probably eight. This is mainly 
because these data, being averages, regress to the mean and thus have a substantial additive 
constant. If we repeat the analysis after subtracting .1 from all dissimilarities we get basically 
the same solution, but with somewhat smoother trajectories. 


## 

itel 

511 

lambda 

## 

itel 

2 

lambda 

## 

itel 

2 

lambda 

## 

itel 

1 

lambda 

## 

itel 

1 

lambda 

## 

itel 

1 

lambda 


0.000000 stress 
0.001000 stress 
0.002000 stress 
0.370000 stress 
0.371000 stress 
0.372000 stress 


0.000176 

0.000176 

0.000176 

0.007642 

0.007642 

0.007642 


penalty 

penalty 

penalty 

penalty 

penalty 

penalty 


0.150789 

0.037759 

0.037619 

0.000000 

0.000000 

0.000000 
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-0.05 0.00 0.05 0.10 


dim 1 


Now the singular values of the FDS solution are 2.05e-01, 1.34e-01, l.lle-01, 6.03e-02, 3.11e- 
02, 3.97e-04, 1.18e-07, 4.55e-12, 2.10e-18, and the approximate Gower rank is more like five 
or six. 


Ekman 

The next example analyzes dissimilarities between 14 colors, taken from Ekman (1954). The 
original similarities Sij, averaged over 31 subjects, were transformed to dissimilarities by 

Sij 1 Sij . 


## 

itel 

1482 

lambda 

0.000000 

stress 

0.000088 

penalty 

0.426110 

## 

itel 

5 

lambda 

0.010000 

stress 

0.000132 

penalty 

0.118988 

## 

itel 

3 

lambda 

0.020000 

stress 

0.000253 

penalty 

0.112777 

## 

itel 

1 

lambda 

0.100000 

stress 

0.003195 

penalty 

0.070791 

## 

itel 

1 

lambda 

0.200000 

stress 

0.010778 

penalty 

0.024407 

## 

itel 

1 

lambda 

0.300000 

stress 

0.016125 

penalty 

0.003230 

## 

itel 

1 

lambda 

0.400000 

stress 

0.017142 

penalty 

0.000165 

## 

itel 

4 

lambda 

0.500000 

stress 

0.017213 

penalty 

0.000000 

## 

itel 

1 

lambda 

0.600000 

stress 

0.017213 

penalty 

0.000000 

## 

itel 

1 

lambda 

0.610000 

stress 

0.017213 

penalty 

0.000000 

## 

itel 

1 

lambda 

0.620000 

stress 

0.017213 

penalty 

0.000000 

## 

itel 

1 

lambda 

0.630000 

stress 

0.017213 

penalty 

0.000000 
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-0.10 -0.05 0.00 0.05 0.10 


dim 1 


If we transform the Ekman similarities by <5^ = (1 — Sij) 3 then its is known (De Leeuw (2016)) 
that the Gower rank is equal to two. Thus the FDS solution has rank 2, and the 2MDS 
solution is the global minimum. 


## itel 
## itel 
## itel 
## itel 
## itel 
## itel 
## itel 


99 lambda 
1 lambda 
1 lambda 
1 lambda 
1 lambda 
1 lambda 
1 lambda 


0.000000 

0.010000 

0.020000 

0.100000 

0.110000 

0.120000 

0.130000 


stress 0.011025 
stress 0.011025 
stress 0.011025 
stress 0.011025 
stress 0.011025 
stress 0.011025 
stress 0.011025 


penalty 0.433456 
penalty 0.000000 
penalty 0.000000 
penalty 0.000000 
penalty 0.000000 
penalty 0.000000 
penalty 0.000000 
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-0.10 -0.05 0.00 0.05 0.10 


dim 1 

Morse in Two 

Next, we use dissimilarities between 36 Morse code signals (Rothkopf (1957)). We used the 
symmetrized version morse from the smacof package (De Leeuw and Mair (2009)). 


## 

itel 

1461 

lambda 

0.000000 

stress 

0.000763 

penalty 

0.472254 

## 

itel 

6 

lambda 

0.010000 

stress 

0.000858 

penalty 

0.322181 

## 

itel 

4 

lambda 

0.020000 

stress 

0.001147 

penalty 

0.308335 

## 

itel 

1 

lambda 

0.100000 

stress 

0.008576 

penalty 

0.216089 

## 

itel 

1 

lambda 

0.200000 

stress 

0.028903 

penalty 

0.119364 

## 

itel 

1 

lambda 

0.300000 

stress 

0.051285 

penalty 

0.060060 

## 

itel 

1 

lambda 

0.400000 

stress 

0.068653 

penalty 

0.028190 

## 

itel 

1 

lambda 

0.500000 

stress 

0.080258 

penalty 

0.011356 

## 

itel 

1 

lambda 

0.600000 

stress 

0.086572 

penalty 

0.003578 

## 

itel 

1 

lambda 

0.700000 

stress 

0.089140 

penalty 

0.000854 

## 

itel 

1 

lambda 

0.800000 

stress 

0.089898 

penalty 

0.000116 

## 

itel 

1 

lambda 

0.830000 

stress 

0.089958 

penalty 

0.000053 

## 

itel 

1 

lambda 

0.840000 

stress 

0.089970 

penalty 

0.000040 

## 

itel 

197 

lambda 

0.850000 

stress 

0.089949 

penalty 

0.000000 
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-0.04 0.00 0.02 0.04 


dim 1 


Vegetables 


Our first one-dimensional example uses paired comparisons of 9 vegetables, originating with 
Guilford (1954) and taken from the psych package (Revelle (2018)). The proportions are 
transformed to dissimilarities by using the normal quantile function, i.e. = T -1 We 

use a short sequence for A. 


## itel 1412 lambda 
## itel 5 lambda 
## itel 5 lambda 
## itel 23 lambda 


0.000000 stress 
0.010000 stress 
0.100000 stress 
1.000000 stress 


0.013675 

0.013716 

0.016719 

0.035301 


penalty 0.269308 
penalty 0.114786 
penalty 0.069309 
penalty 0.000000 
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This example was previously analyzed in De Leeuw (2005) using enumeration of all permu¬ 
tations. He found 14354 isolated local minima, and a global minimum equal to the one we 
computed here. 


Plato 


Mail', Groenen, and De Leeuw (2019) use seriation of the works of Plato, from the data 
collected by Cox and Brandwood (1959), as an example of unidimensional scaling. We first 
run this example with our usual sequence of five A values. 


## itel 
## itel 
## itel 
## itel 
## itel 


169 lambda 
3 lambda 

3 lambda 

4 lambda 
9 lambda 


0.000000 

0.010000 

0.100000 

1.000000 

10.000000 


stress 0.000000 
stress 0.000062 
stress 0.005117 
stress 0.106058 
stress 0.139462 


penalty 0.410927 
penalty 0.255246 
penalty 0.194993 
penalty 0.019675 
penalty 0.000000 
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## 

## 

## 

## 

## 

## 

## 

## 


[1J 
[ 2 ,] 
[3,] 
[4,] 
[5,] 
[ 6 ,] 
[7J 


[, 1 ] 

"Timaeus" 

"Republic" 

"Critias" 

"Sophist" 

"Politicus 1 

"Philebus" 

"Laws" 



which is different from the order at the global minimum, which has Republic before Timaeus. 
Thus we have recovered a local minimum, and it seems our sequence of A values was not fine 
enough to do the job properly. So we try a longer and finer sequence. 


## 

itel 

169 

lambda 

0.000000 

stress 

0.000000 

penalty 

0.410927 

## 

itel 

3 

lambda 

0.000100 

stress 

0.000000 

penalty 

0.263015 

## 

itel 

3 

lambda 

0.001000 

stress 

0.000001 

penalty 

0.262280 

## 

itel 

3 

lambda 

0.010000 

stress 

0.000064 

penalty 

0.255078 

## 

itel 

3 

lambda 

0.100000 

stress 

0.005123 

penalty 

0.194945 

## 

itel 

2 

lambda 

0.200000 

stress 

0.016184 

penalty 

0.147493 

## 

itel 

1 

lambda 

0.300000 

stress 

0.026997 

penalty 

0.119323 

## 

itel 

1 

lambda 

0.400000 

stress 

0.040023 

penalty 

0.093615 

## 

itel 

1 

lambda 

0.500000 

stress 

0.053688 

penalty 

0.072330 

## 

itel 

1 

lambda 

0.600000 

stress 

0.066833 

penalty 

0.055452 

## 

itel 

1 

lambda 

0.700000 

stress 

0.078832 

penalty 

0.042269 

## 

itel 

1 

lambda 

0.800000 

stress 

0.089439 

penalty 

0.032019 

## 

itel 

1 

lambda 

0.900000 

stress 

0.098557 

penalty 

0.024079 
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## itel 1 lambda 1.000000 stress 0.106135 penalty 0.017940 
## itel 6 lambda 2.000000 stress 0.130789 penalty 0.000148 
## itel 13 lambda 3.000000 stress 0.131135 penalty 0.000000 


x 



4 

index 


7 


Now the order is 

## [, 1 ] 

## [1,] "Republic" 
## [2,] "Timaeus" 

## [3,] "Critias" 

## [4,] "Sophist" 

## [5,] "Politicus" 
## [6,] "Philebus" 
## [7,] "Laws" 


which does indeed correspond to the global minimum. 
With a different A sequence we find the same solution. 


## 

itel 

169 

lambda 

0.000000 

## 

itel 

3 

lambda 

0.001000 

## 

itel 

2 

lambda 

0.002000 

## 

itel 

2 

lambda 

0.004000 

## 

itel 

2 

lambda 

0.008000 

## 

itel 

2 

lambda 

0.016000 

## 

itel 

2 

lambda 

0.032000 

## 

itel 

2 

lambda 

0.064000 

## 

itel 

2 

lambda 

0.128000 

## 

itel 

2 

lambda 

0.256000 


stress 0.000000 penalty 0.410927 
stress 0.000001 penalty 0.262296 
stress 0.000003 penalty 0.261483 
stress 0.000010 penalty 0.259872 
stress 0.000041 penalty 0.256690 
stress 0.000159 penalty 0.250470 
stress 0.000613 penalty 0.238574 
stress 0.002266 penalty 0.216785 
stress 0.007791 penalty 0.180067 
stress 0.023483 penalty 0.127006 
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## itel 
## itel 
## itel 
## itel 


x 


2 lambda 0.512000 stress 0.056940 penalty 0.067948 

3 lambda 1.024000 stress 0.107743 penalty 0.017937 

8 lambda 2.048000 stress 0.131059 penalty 0.000032 

9 lambda 4.096000 stress 0.131135 penalty 0.000000 



7 


index 


The order is 

## [, 1 ] 

## [1,] "Republic" 
## [2,] "Timaeus" 

## [3,] "Critias" 

## [4,] "Sophist" 

## [5,] "Politicus" 
## [6,] "Philebus" 
## [7,] "Laws" 


Morse in One 

Now for a more challenging example. The Morse code data have been used to try out exact 
unidimensional MDS techniques, for example by Palubeckis (2013). We will enter the global 
minimum contest by using 10,000 values of A, in an equally spaced sequence from 0 to 10. 
This is not as bad as it sounds. For the 10,000 FDS solutions system.time () tells us 

## user system elapsed 
## 9.476 0.806 10.297 

The one-dimensional plot show quite a bit of movement, but much of it seems to be contained 
in the very first change of A. 
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We can also plot stress and the penalty term as functions of A. Again, note the big change in 
the penalty term when A goes from zero to 0.001. 
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lambda 

After the first 2593 values of A the penalty term is zero and we stop, i.e. we estimate A + is 
2.593. At that point we have run a total of 5013 FDS iterations, and thus on average about 
two iterations per A value. Stress has increased from 0.0007634501 to 0.2303106976 and the 
penalty value has decreased from 0.4815136419 to 0.0000000001. We find the following order 
of the points on the dimension. 

## [, 1 ] 

## [1,] 
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## [ 2 ,] 

## [3,] 

## [4,] 

## [5,] 

## [ 6 ,] " —" 

## [7,] 

## [ 8 ,] 

## [9,] 

## [ 10 ,] —" 

## [ 11 ,] "_" 

## [ 12 ,] ." 

## [13,] 

## [14,] " 

## [15,] "." 

## [16,] "_ 

## [17,] 

## [18,] " 

## [19,] 

## [ 20 ,] 

## [ 21 ,] _" 

## [ 22 ,] " . . 

## [23,] 

## [24,] " 

## [25,] " — . . . " 

## [26,] " — 

## [27,] " — 

## [28,] 

## [29,] " 

## [30,] " — 

## [31,] " —" 

## [32,] " 

## [33,] " — 

## [34,] " 

## [35,] "- 

## [36,] "-" 

Our order, and consequently our solution, is the same as the exact global solution given by 
Palubeckis (2013). See his table 4, reproduced below. The difference is that computing our 
solution takes 10 seconds, while his takes 494 seconds. But off course we would not know we 
actually found the global mimimum if the exact exhaustive methods had not analyzed the 
same data before. 
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Table 4: Optimal solutions for the full Morse code dissimilarity matrix and th 


i 

Morse code full 

P(i) 

x Pd) 

i 

(E). 

0.00000 

2 

(T) - 

5.83333 

3 

(I).. 

24.77778 

4 

(A) - 

34.08333 

5 

(N) - 

44.11111 

6 

(M)-- 

52.33333 

7 

(S) • • • 

70.30556 

8 

(U) • • - 

83.13889 

9 

(R) — 

93.47222 

10 

(W). - - 

102.97222 

11 

(H) • • •• 

114.44444 

12 

(D) - • • 

124.30556 

13 

(K) - • - 

131.52778 

14 

(V) . • - 

145.00000 

15 

(5). 

152.44444 

16 

(4) .... - 

159.91667 

17 

(F) • * — 

170.75000 

18 

(L) * - •• 

182.25000 

19 

(B) - • •• 

189.52778 

20 

(X)- 

199.55556 

21 

(6) — •... 

205.66667 

22 

(3)- 

220.13889 

23 

(C) - * — 

229.47222 

24 

(Y)- 

238.41667 

25 

(7) - - • • • 

249.30556 

26 

(Z) - 

254.88889 

27 

(Q) - 

264.38889 

28 

(P) - 

270.83333 

29 

(J) - 

282.94444 

30 

(G) - - • 

292.47222 

31 

(O) - 

300.22222 

32 

(2) - 

310.50000 

33 

(8) - • • 

320.36111 

34 

(1) - 

333.50000 

35 

(9) - 

341.86111 

36 

(0) - 

350.27778 


Discussion 

There is one surprising (to me, at least) finding from all our examples. There is a value, 
say A_|_, such that the penalty t(Y ) is zero for all PFDS solutions with A > A + . In other 
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words, our penalty function acts like a smooth exact penalty function. The precise reason for 
exactness in our case (if there is one) is not entirely clear to me yet, but it is obviously a 
topic for further research, using for example the recent theoretical framework of Dolgopolik 
(2016a), Dolgopolik (2016b), Dolgopolik (2017), Dolgopolik (2018). 

In our two dimensional examples we always start our plots with the first two dimensions of 
the FDS configuration. These two-dimensional configurations are usually small (all points 
relatively close to the origin), because so much variation is still in the higher dimensions. If 
A increases the growth of the configurations is one important aspect of configuration change. 

In our iteration counts with short sequences of A we see relatively small increases in stress 
and small decreases in the penalty term, until we get closer to A + , when we suddenly see a 
sudden change and a larger number of iterations. This is also reflected in the figures, where 
generally the change to the last solution (with the largest A) makes the largest jump. This 
suggest a finer sequence near A + and perhaps an adaptive strategy for choosing A. Or to 
use brute force, as in the unidimensional Morse code example. With such longer and finer 
sequences convergence becomes more smooth. 

Another all-important aspect of the method discussed here is that it assumes computation of 
the global minimum for each A. Since we cannot expect a result as nice as the one for FDS 
(all local minima are global) for A > 0 our method remains somewhat heuristic. We have 
seen that some sequences of A can take us to a non-global local minimum. Of course the fact 
that we start with a global minimum for A = 0 is of some help, but we do not know how far 
it will take us in general. Jumps near A + may indicate bifurcations to other local minima. 

We have not stressed in the paper that minimizing the penalty function is a continuation 
method (Allgower and George (1979)). This means that probably better methods are available 
to follow the trajectory of solutions along A > 0. There are also possibilities in exploring the 
fact that the maximum over Z of the penalty function (12) is a concave function of the single 
variable A, which is a constant function for all A > A + . There is a duality theory associated 
with these Courant penalty functions, which we have not used or explored so far. 


Appendix A: Exterior Penalty Methods 

Suppose X C M n and / : K n M is continuous. Define 

W = argmin f{x) 

Suppose W is non-empty and that x* is any element of A*, and 

/* = f(x*) = min f(x). 

The following convergence analysis of external linear penalty methods is standard and can be 
found in many texts (for example, Zangwill (1969), section 12.2). 

The penalty term g : M n =>■ M + is continuous and satisfies g(x) = 0 if and only if x e X. For 
each A > 0 we define the (linear, external) penalty function 

h(x, A) = f(x) + A g(x). 

28 


(18) 



Suppose {Afc} is a strictly increasing sequence of positive real numbers. Define 

X k — argrnin h(x, X k ). (19) 

Suppose all X k are nonempty and contained in a compact subset of X. Choose x k G X k 
arbitrarily. 

Lemma 2: [Basic] 

1 . h(x k) \ k ) ^ h(x k -\-\, ). 

2 : > g(x k+ 1 ). 

3: /(a;*,) < f(x k + 1 ). 

4: /* > /i(x fc , Afc) > /(x fe ). 

Proof: 

1: We have the chain 

h(x k + 1 , A fc+ i) = 7 (^+ 1 ) +A fc+ i 5 f(a; fc+ i) > /(x fc+ i) + A^^+i) > f(x k ) + X k g(x k ) = h(x k ,X k ). 
2: Both 

f(x k ) + A k g(x k ) < f(x k+ 1 ) + X k g(x k+ i), ( 20 ) 

f(x k + 1 ) + X k+ ig{x k+1 ) < f(x k ) + A fc+ i^(x fe ). ( 21 ) 

Adding inequalities (20) and ( 21 ) gives 

X k g{x k ) + X k+ ig(x k+ i) < X k g(x k+ i) + X k+ ig(x k ), 

or 

(Afc X k+ i)g(x k ) 7 (Afc Afc_|_i)( 7 ( 3 :fc-|_i), 

and thus g(x k ) > g(x k+1 ). 

3: First 

f(x k+ 1 ) + X k g(x k+1 ) > f(x k ) + A k g{x k ). ( 22 ) 

We just proved that g(x k +i) > g(x k ), and thus 

f(x k ) + Afcfi'(xfc) > f(x k ) + X k g(x k+ i). (23) 

Combining inequalities (22) and (23) gives f(x k+ 1 ) > f(x k ). 

4: We have the chain 

/* = fix*) + X k g(x*) > f(x k ) + X k g(x k ) > f(x k ). 

■ 

Theorem 3: Suppose the sequence {Afc}fc e ^ diverges to 00 and x ** is the limit of any 
convergent subsequence {xt}e £ L- Then x ** G X*, and f(x **) = /*, and g(x **) = 0. 
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Proof: Using part 4 of lemma 2 


lim h(x t , Xi) = Hm {f(x t ) + A t g(x t )} = /(x* 


r) + limA^(^) < /(x*). 


Thus {h(xt, X^i^l} is a bounded increasing sequence, which consequently converges, and 
lim^A (,g{x() also converges. Since {Xi}e £ l —> oo it follows that lim = g(x **) = 0. 
Thus x** e X. Since f(x e ) < /* we see that /(x**) < /*, and thus x** e T* and /(x**) = /*. 


Appendix B: Code 

penalty. R 

smacofLoss <- function (d, w, delta) { 
return (sum (w * (delta - d) ~ 2) / 4) 

} 

smacofBmat <- function (d, w, delta) { 
dd <- ifelse (d == 0, 0, 1 / d) 
b <- -dd * w * delta 
diag (b) <- -rowSums (b) 
return(b) 

> 

smacofVmat <- function (w) { 
v <- -w 

diag(v) <- -rowSums (v) 
return (v) 

} 

smacofGuttman <- function (x, b, vinv) { 
return (vinv 1*1 b 1*1 x) 

> 

columnCenter <- function (x) { 

return (apply (x, 2, function (z) z - mean (z))) 

} 

smacofComplement <- function (y, v) { 
return (sum (v * tcrossprod (y)) / 4) 

> 


smacofPenalty <- 
function (w. 
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delta, 

p = 2, 

lbd = 0, 

zold = columnCenter (diag (nrow (delta))), 
itmax = 10000, 
eps = le-10, 
verbose = FALSE) { 
itel <- 1 
n <- nrow (zold) 
vmat <- smacofVmat (w) 

vinv <- solve (vmat +(l/n))-(l/n) 
dold <- as.matrix (dist (zold)) 

mold <- sum (w * delta * dold) / sum (w * dold * dold) 

zold <- zold * mold 

dold <- dold * mold 

yold <- zold [, (p + 1) : n] 

sold <- smacofLoss (dold, w, delta) 

bold <- smacofBmat (dold, w, delta) 

told <- smacofComplement (yold, vmat) 

uold <- sold + lbd * told 

repeat { 

znew <- smacofGuttman (zold, bold, vinv) 
ynew <- znew [, (p+1) : n] / (1 + lbd) 
znew [, (p + 1) : n] <- ynew 
xnew <- znew [, 1:p] 
dnew <- as.matrix (dist (znew)) 
bnew <- smacofBmat (dnew, w, delta) 
tnew <- smacofComplement (ynew, vmat) 
snew <- smacofLoss (dnew, w, delta) 
unew <- snew + lbd * tnew 
if (verbose) { 
cat ( 

"itel ", 

f ormatC(itel , width = 4, format = "d"), 

"sold ", 
formate ( 
sold, 

width = 10, 
digits = 6, 
format = "f" 

), 

"snew ", 
formate ( 
snew, 
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width = 10, 
digits = 6, 
format = "f" 

), 

"told ", 
formate ( 
told, 

width = 10, 
digits = 6, 
format = "f" 

), 

"tnew " , 
formate ( 
tnew, 

width = 10, 
digits = 6, 
format = "f" 

), 

"uold ", 
formate ( 
uold, 

width = 10, 
digits = 6, 
format = "f" 

), 

"unew " , 
formate ( 
unew, 

width = 10, 
digits = 6, 
format = "f" 

), 

"\n" 

) 

} 

if (((uold - unew) < eps) I I (itel == itmax)) { 
break 

} 

itel <- itel + 1 
zold <- znew 
bold <- bnew 
sold <- snew 
told <- tnew 
uold <- unew 
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> 

zpri <-znew °/ 0 *°/o svd(znew)$v 
xpri <- zpri[, 1: p] 
return (list ( 
x = xpri, 
z = zpri, 
b = bnew, 

1 = lbd, 
s = snew, 
t = tnew, 
itel = itel 

)) 


runPenalty.R 

runPenalty <- 
function (w, 

delta, 

lbd, 

p = 2, 

itmax = 10000, 
eps = le-10, 
cut = le-6, 
write = TRUE, 
verbose = FALSE) { 
m <- length (lbd) 
hList <- as.list (l:m) 
hList[[l]] <- 
smacofPenalty ( 
w, 

delta, 

P. 

lbd = lbd[l], 
itmax = itmax, 
eps = eps, 
verbose = verbose 

) 

for (j in 2:m) { 
hList [[j]] <- 
smacofPenalty ( 
w, 

delta. 


zold = hList[[j - l]]$z, 

lbd = lbd [j] , 

itmax = itmax, 

eps = eps, 

verbose = verbose 

) 

> 

mm <- m 

for (i in l:m) { 

if (write) { 
cat ( 

"itel" , 

f ormatC(hList [[i]]$itel, width = 4, format = "d") , 
"lambda" , 
formate ( 

hList[[i]] $1 , 
width = 10, 
digits = 6, 
format = "f" 

), 

"stress" , 
formate ( 

hList[[i]] $s , 
width = 8, 
digits = 6, 
format = "f" 

), 

"penalty" , 
formate ( 

hList[[i]] $t , 
width = 8, 
digits = 6, 
format = "f" 

), 

"\n" 

) 

} 

if (hList[[i]] $t < cut) { 
mm <- i 
break 

} 

> 

return(hList[1 :mm]) 
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writeSelected <- function (hList, ind) { 
m <- length (hList) 
n <- length (ind) 

mn <- sort (union (union (1:3, ind), m - (2:0))) 
for (i in mn) { 

if (i > m) { 
next 

> 

cat ( 

"itel" , 

f ormatC(hList [[i]] $itel , width = 4, format = "d"), 
"lambda" , 
formate ( 

hList[[i]] $1 , 
width = 10, 
digits = 6, 
format = "f" 

), 

"stress" , 
formate ( 

hList[[i]] $s , 
width = 8, 
digits = 6, 
format = "f" 

), 

"penalty" , 
formate ( 

hList [ [i]] $t , 
width = 8, 
digits = 6, 
format = "f" 

), 

"\n" 

) 

} 

} 

matchMe.R 

matchMe <- function (x, 

itmax = 100, 
eps = le-10, 
verbose = FALSE) { 
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m <- length (x) 
y <- sumList (x) / m 
itel <- 1 

fold <- sum (sapply (x, function (z) 

(z - y) 2)) 
repeat { 

for (j in l:m) { 

u <- crossprod (x[[j]j, y) 
s <- svd (u) 

r <- tcrossprod (s$u, s$v) 
x[[j]] <- x[ [j] ] 1*1 r 

> 

y <- sumList (x) / m 

fnew <- sum (sapply (x, function (z) 

(z - y) 2)) 
if (verbose) { 

> 

if (((fold - fnew) < eps) I I (itel == itmax)) 
break 

itel <- itel + 1 
fold <- fnew 

} 

return (x) 

} 

sumList <- function (x) { 
m <- length (x) 
y <- x[[l]] 
for (j in 2:m) { 

y <~ y + x[[j]] 

} 

return (y) 


plotMe.R 

plotMe2 <- f unction(hList , labels, s = 1, t = 2) { 
n <- nrow(hList [[1]] $x) 
m <- length (hList) 
par(pty = "s") 

hMatch <- matchMe (lapply (hList, function(r) 
r$x) ) 
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hMat <- matrix (0, 0, 2) 
for (j in l:m) { 

hMat <- rbind(hMat, hMatch[[j]][, c(s, t)]) 

} 

plot (hMat, 

xlab = "dim 1", 
ylab = "dim 2", 

col = c(rep("RED", n*(m-l)), repC'BLUE", n)), 
cex = c(rep(l, n*(m-l)), rep(2, n))) 
for (i in l:n) { 

hLine <- matrix (0, 0, 2) 
for (j in l:m) { 

hLine <- rbind (hLine, hMatch[[j]][i, c(s, t)]) 

> 

lines (hLine) 

} 

text (hMatch[[m]], labels, cex = .75) 


plotMel <- f unction(hList , labels) { 
n <- length (hList[ [1] ] $x) 
m <- length (hList) 
blow <- function (x) { 
n <- length (x) 

return (matrix (c(l:n, x), n, 2)) 

} 

hMat <- matrix (0, 0, 2) 
for (j in l:m) { 

hMat <- rbind (hMat, blow (hList[ [j] ] $x) ) 

} 

plot (hMat, 

xlab = "index", 
ylab = "x" , 

col = c(rep("RED", n*(m-l)), repC'BLUE", n)), 
cex = c(rep(l, n*(m-l)), rep(2, n))) 
for (i in l:n) { 

hLine <- matrix (0, 0, 2) 
for (j in l:m) { 

hLine <- rbind (hLine, blow (hList[ [j] ] $x) [i, ]) 
lines (hLine) 

> 

} 

text(blow (hList[[m]] $x) , labels, cex = 1.00) 
for (i in l:n) { 
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abline(h = hList[[m]]$x[i]) 

> 

} 

checkUni.R 

checkUni <- function (w, delta, x) { 
x <- drop (x) 
n <- length (x) 

vinv <- solve (smacofVmat (w) +(l/n))-(l/n) 

return (drop (vinv %*% rowSums (w * delta * sign (outer (x, x, "-"))))) 

} 
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